Exploration and exploitation balance management in fuzzy reinforcement learning
Abstract
This paper proposes a fuzzy scheme for managing the balance between exploration and exploitation that can be implemented in any critic-only fuzzy reinforcement learning method. The paper focuses, however, on fuzzy Sarsa learning (FSL), a recently developed continuous reinforcement learning method, because of its advantages. Establishing this balance depends heavily on how accurately the action value function is approximated. First, the overfitting problem in approximating the action value function in continuous reinforcement learning algorithms is discussed, and a new adaptive learning rate is proposed to prevent it. By relating the learning rate to the inverse of the “fuzzy visit value” of the current state, the training data set is forced to have a uniform effect on the weight parameters of the approximator, and hence overfitting is resolved. Then, a fuzzy balancer is introduced to balance exploration against exploitation by generating a suitable temperature factor for the Softmax formula. Finally, an enhanced FSL (EFSL) is obtained by integrating the proposed adaptive learning rate and the fuzzy balancer into FSL. Simulation results show that EFSL eliminates overfitting, manages the balance well, and outperforms FSL in terms of learning speed and action quality. © 2009 Elsevier B.V. All rights reserved.
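As a rough illustration of the two ingredients named in the abstract, the sketch below shows Softmax (Boltzmann) action selection with a temperature factor, and a learning rate that shrinks with the inverse of an accumulated visit value. The function names, the `alpha0` constant, the cap at `alpha0`, and the way the visit value is accumulated from rule firing strengths are all assumptions made for illustration. The abstract does not give the paper's exact formulas, and the fuzzy balancer that generates the temperature is not modelled here.

```python
import math


def softmax_probs(action_values, temperature):
    """Softmax (Boltzmann) action probabilities for a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    best = max(action_values)
    exps = [math.exp((q - best) / temperature) for q in action_values]
    total = sum(exps)
    return [e / total for e in exps]


def update_visit_values(visit_values, firing_strengths):
    """Accumulate a per-rule visit value from firing strengths (assumed form)."""
    return [v + f for v, f in zip(visit_values, firing_strengths)]


def adaptive_learning_rate(visit_values, firing_strengths, alpha0=1.0):
    """Learning rate proportional to the inverse of the current state's
    visit value (assumed form, capped at alpha0 for rarely visited states)."""
    state_visit = sum(v * f for v, f in zip(visit_values, firing_strengths))
    return alpha0 / max(state_visit, 1.0)


if __name__ == "__main__":
    q_values = [1.0, 2.0, 3.0]
    print(softmax_probs(q_values, 100.0))  # high temperature: close to uniform
    print(softmax_probs(q_values, 0.01))   # low temperature: close to greedy
```

A high temperature spreads probability almost evenly across actions (exploration), while a low temperature concentrates it on the highest-valued action (exploitation). In this sketch, states that have been visited often receive smaller updates, which is one way to keep frequently seen training samples from dominating the approximator's weights.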
Similar resources
Exploration and Exploitation Tradeoff using Fuzzy Reinforcement Learning
The difficulty of balancing exploration and exploitation in a multiagent environment is a dilemma without a clear answer, and a variety of methods are still being investigated to address it. In this paper, we provide a method based on fuzzy variables for managing exploration and exploitation in a multiagent environment. In this method, an effective agent ...
How an Adaptive Learning Rate Benefits Neuro-Fuzzy Reinforcement Learning Systems
To acquire adaptive behaviors of multiple agents in an unknown environment, several neuro-fuzzy reinforcement learning systems (NFRLSs) have been proposed by Kuremoto et al. Meanwhile, to manage the balance between exploration and exploitation in fuzzy reinforcement learning (FRL), an adaptive learning rate (ALR), which adjusts the learning rate by considering the “fuzzy visit value” of the current sta...
Control of exploitation-exploration meta-parameter in reinforcement learning
In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance between exploitation and exploration. Our learning scheme is based on model-based RL, in which the Bayes inference with forgetting effect estimates the state-transition probability of the environment. The balance parameter,...
A Survey of Exploration Strategies in Reinforcement Learning
A fundamental issue in reinforcement learning algorithms is the balance between exploration of the environment and exploitation of information already obtained by the agent. This paper surveys exploration strategies used in reinforcement learning and summarizes the existing research with respect to their applicability and effectiveness.
Realworld Robot Navigation by Two Dimensional Evaluation Reinforcement Learning
The trade-off between exploration and exploitation is present in any learning method based on trial and error, such as reinforcement learning. We have proposed a reinforcement learning algorithm using reward and punishment as repulsive evaluation (2D-RL). In the algorithm, an appropriate balance between exploration and exploitation can be attained by using interest and utility. In this paper, we appl...
Journal title:
- Fuzzy Sets and Systems
Volume 161
Publication date: 2010